Overview

Dataset statistics

Number of variables: 15
Number of observations: 500000
Missing cells: 125000
Missing cells (%): 1.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 57.2 MiB
Average record size in memory: 120.0 B
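
The percentage and per-record figures above follow from the raw counts. The sketch below reproduces that arithmetic under two assumptions that the report itself does not state: that the missing-cell share is taken over variables × observations, and that 1 MiB is 1024 × 1024 bytes. The variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 15
n_observations = 500_000
missing_cells = 125_000
total_size_mib = 57.2

# Assumption: the denominator is every cell in the table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: binary mebibytes, i.e. 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report, which suggests the assumptions are reasonable.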

Variable types

Numeric: 8
Categorical: 6
DateTime: 1

Alerts

age has 25000 (5.0%) missing values [Missing]
gender has 25000 (5.0%) missing values [Missing]
employment_type has 25000 (5.0%) missing values [Missing]
annual_income has 25000 (5.0%) missing values [Missing]
credit_score has 25000 (5.0%) missing values [Missing]
customer_id is uniformly distributed [Uniform]
customer_id has unique values [Unique]
repayment_history has 67883 (13.6%) zeros [Zeros]

Reproduction

Analysis started: 2026-02-23 04:17:27.089989
Analysis finished: 2026-02-23 04:17:53.897472
Duration: 26.81 seconds
Software version: ydata-profiling v4.18.1
Download configuration: config.json
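
The reported duration can be checked against the two timestamps. A minimal sketch using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2026-02-23 04:17:27.089989")
finished = datetime.fromisoformat("2026-02-23 04:17:53.897472")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"Duration: {duration_seconds:.2f} seconds")
```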

Variables

customer_id
Real number (ℝ)

Uniform  Unique 

Distinct: 500000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 349999.5
Minimum: 100000
Maximum: 599999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 100000
5-th percentile: 124999.95
Q1: 224999.75
median: 349999.5
Q3: 474999.25
95-th percentile: 574999.05
Maximum: 599999
Range: 499999
Interquartile range (IQR): 249999.5

Descriptive statistics

Standard deviation: 144337.71
Coefficient of variation (CV): 0.41239405
Kurtosis: -1.2
Mean: 349999.5
Median Absolute Deviation (MAD): 125000
Skewness: 8.7177233 × 10^-17
Sum: 1.7499975 × 10^11
Variance: 2.0833375 × 10^10
Monotonicity: Strictly increasing
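
These statistics are what one would expect if customer_id is simply the run of consecutive integers from 100000 to 599999. That is an assumption, although the distinct count, range and strict monotonicity are all consistent with it. Under that assumption the mean, sum and variance have closed forms. The sketch below evaluates them, and its use of the n - 1 (sample) denominator for the variance is also an assumption.

```python
import math

n = 500_000                  # number of observations
lo, hi = 100_000, 599_999    # reported minimum and maximum

# Assumption: the ids are consecutive integers, one per row.
assert hi - lo + 1 == n

mean = (lo + hi) / 2
total = n * (lo + hi) // 2

# Sample variance (n - 1 denominator) of n consecutive integers is n * (n + 1) / 12.
variance = n * (n + 1) / 12
std = math.sqrt(variance)
cv = std / mean

print(mean, total, variance, round(std, 2), round(cv, 8))
```

The values agree with the reported mean, sum, variance, standard deviation and coefficient of variation to the precision shown.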
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
100000 | 1 | < 0.1%
100001 | 1 | < 0.1%
100002 | 1 | < 0.1%
100003 | 1 | < 0.1%
100004 | 1 | < 0.1%
100005 | 1 | < 0.1%
100006 | 1 | < 0.1%
100007 | 1 | < 0.1%
100008 | 1 | < 0.1%
100009 | 1 | < 0.1%
Other values (499990) | 499990 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
100000 | 1 | < 0.1%
100001 | 1 | < 0.1%
100002 | 1 | < 0.1%
100003 | 1 | < 0.1%
100004 | 1 | < 0.1%
100005 | 1 | < 0.1%
100006 | 1 | < 0.1%
100007 | 1 | < 0.1%
100008 | 1 | < 0.1%
100009 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
599999 | 1 | < 0.1%
599998 | 1 | < 0.1%
599997 | 1 | < 0.1%
599996 | 1 | < 0.1%
599995 | 1 | < 0.1%
599994 | 1 | < 0.1%
599993 | 1 | < 0.1%
599992 | 1 | < 0.1%
599991 | 1 | < 0.1%
599990 | 1 | < 0.1%

age
Real number (ℝ)

Missing 

Distinct: 49
Distinct (%): < 0.1%
Missing: 25000
Missing (%): 5.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.011236
Minimum: 21
Maximum: 69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 21
5-th percentile: 23
Q1: 33
median: 45
Q3: 57
95-th percentile: 67
Maximum: 69
Range: 48
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.134525
Coefficient of variation (CV): 0.31402215
Kurtosis: -1.1992601
Mean: 45.011236
Median Absolute Deviation (MAD): 12
Skewness: -0.0016654567
Sum: 21380337
Variance: 199.78479
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)

Common values

Value | Count | Frequency (%)
43 | 9885 | 2.0%
65 | 9883 | 2.0%
37 | 9864 | 2.0%
56 | 9860 | 2.0%
35 | 9838 | 2.0%
36 | 9820 | 2.0%
42 | 9818 | 2.0%
25 | 9797 | 2.0%
57 | 9793 | 2.0%
58 | 9790 | 2.0%
Other values (39) | 376652 | 75.3%
(Missing) | 25000 | 5.0%

Minimum 10 values

Value | Count | Frequency (%)
21 | 9722 | 1.9%
22 | 9789 | 2.0%
23 | 9679 | 1.9%
24 | 9561 | 1.9%
25 | 9797 | 2.0%
26 | 9513 | 1.9%
27 | 9610 | 1.9%
28 | 9649 | 1.9%
29 | 9570 | 1.9%
30 | 9674 | 1.9%

Maximum 10 values

Value | Count | Frequency (%)
69 | 9578 | 1.9%
68 | 9778 | 2.0%
67 | 9761 | 2.0%
66 | 9517 | 1.9%
65 | 9883 | 2.0%
64 | 9706 | 1.9%
63 | 9453 | 1.9%
62 | 9619 | 1.9%
61 | 9756 | 2.0%
60 | 9774 | 2.0%

gender
Categorical

Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 25000
Missing (%): 5.0%
Memory size: 3.8 MiB
Female: 228155
Male: 227893
Other: 18952

Length

Max length: 6
Median length: 5
Mean length: 5.0005516
Min length: 4

Characters and Unicode

Total characters: 2375262
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Female
4th row: Female
5th row: Female

Common Values

Value | Count | Frequency (%)
Female | 228155 | 45.6%
Male | 227893 | 45.6%
Other | 18952 | 3.8%
(Missing) | 25000 | 5.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
female | 228155 | 48.0%
male | 227893 | 48.0%
other | 18952 | 4.0%
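
The two gender tables above report different percentages for the same counts. The figures are consistent with the first table dividing by all rows (including the 25000 missing) and the second dividing by non-missing rows only. The report does not state this, so the sketch below treats it as an assumption and recomputes both sets of percentages.

```python
# Counts copied from the gender tables above.
counts = {"Female": 228_155, "Male": 227_893, "Other": 18_952}
missing = 25_000

non_missing = sum(counts.values())
all_rows = non_missing + missing

# Assumption: first table uses all rows, second table uses non-missing rows.
share_of_all = {k: 100 * v / all_rows for k, v in counts.items()}
share_of_present = {k: 100 * v / non_missing for k, v in counts.items()}

for label in counts:
    print(f"{label}: {share_of_all[label]:.1f}% of all rows, "
          f"{share_of_present[label]:.1f}% of non-missing rows")
```

The same pattern would explain the employment_type tables, the other categorical variable with missing values.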

Most occurring characters

Value | Count | Frequency (%)
e | 703155 | 29.6%
a | 456048 | 19.2%
l | 456048 | 19.2%
F | 228155 | 9.6%
m | 228155 | 9.6%
M | 227893 | 9.6%
O | 18952 | 0.8%
t | 18952 | 0.8%
h | 18952 | 0.8%
r | 18952 | 0.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2375262 | 100.0%

Most frequent character per category

(unknown): same character counts as under "Most occurring characters".

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2375262 | 100.0%

Most frequent character per script

(unknown): same character counts as under "Most occurring characters".

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2375262 | 100.0%

Most frequent character per block

(unknown): same character counts as under "Most occurring characters".

region
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 MiB
South: 125341
East: 125283
West: 125058
North: 124318

Length

Max length: 5
Median length: 4
Mean length: 4.499318
Min length: 4

Characters and Unicode

Total characters: 2249659
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: East
2nd row: South
3rd row: North
4th row: North
5th row: East

Common Values

Value | Count | Frequency (%)
South | 125341 | 25.1%
East | 125283 | 25.1%
West | 125058 | 25.0%
North | 124318 | 24.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
south | 125341 | 25.1%
east | 125283 | 25.1%
west | 125058 | 25.0%
north | 124318 | 24.9%

Most occurring characters

Value | Count | Frequency (%)
t | 500000 | 22.2%
s | 250341 | 11.1%
h | 249659 | 11.1%
o | 249659 | 11.1%
u | 125341 | 5.6%
S | 125341 | 5.6%
E | 125283 | 5.6%
a | 125283 | 5.6%
W | 125058 | 5.6%
e | 125058 | 5.6%
Other values (2) | 248636 | 11.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2249659 | 100.0%

Most frequent character per category

(unknown): same character counts as under "Most occurring characters".

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2249659 | 100.0%

Most frequent character per script

(unknown): same character counts as under "Most occurring characters".

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2249659 | 100.0%

Most frequent character per block

(unknown): same character counts as under "Most occurring characters".

education_level
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 MiB
Graduate: 175082
Secondary: 174326
Post-Graduate: 75462
Primary: 75130

Length

Max length: 13
Median length: 9
Mean length: 8.953012
Min length: 7

Characters and Unicode

Total characters: 4476506
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Secondary
2nd row: Graduate
3rd row: Secondary
4th row: Secondary
5th row: Graduate

Common Values

Value | Count | Frequency (%)
Graduate | 175082 | 35.0%
Secondary | 174326 | 34.9%
Post-Graduate | 75462 | 15.1%
Primary | 75130 | 15.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
graduate | 175082 | 35.0%
secondary | 174326 | 34.9%
post-graduate | 75462 | 15.1%
primary | 75130 | 15.0%

Most occurring characters

Value | Count | Frequency (%)
a | 750544 | 16.8%
r | 575130 | 12.8%
d | 424870 | 9.5%
e | 424870 | 9.5%
t | 326006 | 7.3%
G | 250544 | 5.6%
u | 250544 | 5.6%
o | 249788 | 5.6%
y | 249456 | 5.6%
c | 174326 | 3.9%
Other values (7) | 800428 | 17.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 4476506 | 100.0%

Most frequent character per category

(unknown): same character counts as under "Most occurring characters".

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 4476506 | 100.0%

Most frequent character per script

(unknown): same character counts as under "Most occurring characters".

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 4476506 | 100.0%

Most frequent character per block

(unknown): same character counts as under "Most occurring characters".

employment_type
Categorical

Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 25000
Missing (%): 5.0%
Memory size: 3.8 MiB
Salaried: 285466
Self-Employed: 118933
Unemployed: 70601

Length

Max length: 13
Median length: 8
Mean length: 9.5491937
Min length: 8

Characters and Unicode

Total characters: 4535867
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Salaried
2nd row: Self-Employed
3rd row: Salaried
4th row: Salaried
5th row: Self-Employed

Common Values

Value | Count | Frequency (%)
Salaried | 285466 | 57.1%
Self-Employed | 118933 | 23.8%
Unemployed | 70601 | 14.1%
(Missing) | 25000 | 5.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
salaried | 285466 | 60.1%
self-employed | 118933 | 25.0%
unemployed | 70601 | 14.9%

Most occurring characters

Value | Count | Frequency (%)
e | 664534 | 14.7%
l | 593933 | 13.1%
a | 570932 | 12.6%
d | 475000 | 10.5%
S | 404399 | 8.9%
r | 285466 | 6.3%
i | 285466 | 6.3%
p | 189534 | 4.2%
y | 189534 | 4.2%
m | 189534 | 4.2%
Other values (6) | 687535 | 15.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 4535867 | 100.0%

Most frequent character per category

(unknown): same character counts as under "Most occurring characters".

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 4535867 | 100.0%

Most frequent character per script

(unknown): same character counts as under "Most occurring characters".

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 4535867 | 100.0%

Most frequent character per block

(unknown): same character counts as under "Most occurring characters".

annual_income
Real number (ℝ)

Missing 

Distinct: 473709
Distinct (%): 99.7%
Missing: 25000
Missing (%): 5.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 549733.39
Minimum: 15700.25
Maximum: 16228345
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 15700.25
5-th percentile: 165058.11
Q1: 296246.75
median: 445371.72
Q3: 670686.66
95-th percentile: 1244206.6
Maximum: 16228345
Range: 16212645
Interquartile range (IQR): 374439.91
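
The range and interquartile range are derived from the quantiles listed above. A short check of that arithmetic, using the values as displayed in the report (the maximum is shown rounded, so the range is compared after rounding):

```python
# Quantiles of annual_income as displayed above.
minimum, maximum = 15700.25, 16228345
q1, q3 = 296246.75, 670686.66

# Range is max minus min; IQR is the spread of the middle 50% of values.
value_range = maximum - minimum
iqr = q3 - q1

print(f"Range: {value_range:.0f}")
print(f"IQR: {iqr:.2f}")
```

The IQR is far smaller than the range, which fits the strong right skew reported for this variable.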

Descriptive statistics

Standard deviation: 439483.33
Coefficient of variation (CV): 0.79944812
Kurtosis: 75.619804
Mean: 549733.39
Median Absolute Deviation (MAD): 173902.5
Skewness: 5.5248683
Sum: 2.6112336 × 10^11
Variance: 1.931456 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
555412.42 | 3 | < 0.1%
431574.26 | 3 | < 0.1%
640402.61 | 3 | < 0.1%
597865.84 | 3 | < 0.1%
351330.78 | 3 | < 0.1%
533790.49 | 2 | < 0.1%
441238.82 | 2 | < 0.1%
223062.09 | 2 | < 0.1%
207109.28 | 2 | < 0.1%
246767.02 | 2 | < 0.1%
Other values (473699) | 474975 | 95.0%
(Missing) | 25000 | 5.0%

Minimum 10 values

Value | Count | Frequency (%)
15700.25 | 1 | < 0.1%
28921.66 | 1 | < 0.1%
29899.25 | 1 | < 0.1%
30214.11 | 1 | < 0.1%
31358.69 | 1 | < 0.1%
33013.98 | 1 | < 0.1%
33905.19 | 1 | < 0.1%
34703.65 | 1 | < 0.1%
35008.55 | 1 | < 0.1%
35364.69 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
16228345.4 | 1 | < 0.1%
14791672.27 | 1 | < 0.1%
14643400.72 | 1 | < 0.1%
14329300.3 | 1 | < 0.1%
13790718.38 | 1 | < 0.1%
13676246.71 | 1 | < 0.1%
13644771.71 | 1 | < 0.1%
13576957.99 | 1 | < 0.1%
13188909.65 | 1 | < 0.1%
13001217.01 | 1 | < 0.1%

loan_amount
Real number (ℝ)

Distinct: 496793
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 224278.75
Minimum: 3331.74
Maximum: 7449221.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 3331.74
5-th percentile: 43787.844
Q1: 95226.975
median: 162964.3
Q3: 279429.81
95-th percentile: 606722.88
Maximum: 7449221.5
Range: 7445889.8
Interquartile range (IQR): 184202.83

Descriptive statistics

Standard deviation: 212115.15
Coefficient of variation (CV): 0.9457657
Kurtosis: 30.472487
Mean: 224278.75
Median Absolute Deviation (MAD): 81174.335
Skewness: 3.6908967
Sum: 1.1213938 × 10^11
Variance: 4.4992838 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
110454.33 | 3 | < 0.1%
89797.51 | 3 | < 0.1%
310885.66 | 3 | < 0.1%
375241.4 | 3 | < 0.1%
75520.62 | 3 | < 0.1%
133367.09 | 3 | < 0.1%
119189.81 | 3 | < 0.1%
66912.4 | 3 | < 0.1%
139348.45 | 3 | < 0.1%
213977.12 | 3 | < 0.1%
Other values (496783) | 499970 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
3331.74 | 1 | < 0.1%
3470.64 | 1 | < 0.1%
3749.43 | 1 | < 0.1%
4461.16 | 1 | < 0.1%
4909.59 | 1 | < 0.1%
5214.44 | 1 | < 0.1%
5468.18 | 1 | < 0.1%
5746.19 | 1 | < 0.1%
5916.13 | 1 | < 0.1%
5990.34 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
7449221.52 | 1 | < 0.1%
7121808.62 | 1 | < 0.1%
4982447.31 | 1 | < 0.1%
4647066.59 | 1 | < 0.1%
4641858.47 | 1 | < 0.1%
4486562.43 | 1 | < 0.1%
4457180.62 | 1 | < 0.1%
4303462.2 | 1 | < 0.1%
4254652.57 | 1 | < 0.1%
4218754.74 | 1 | < 0.1%

loan_purpose
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 MiB
Education: 100418
Business: 100220
Other: 100204
Car: 100156
Home: 99002

Length

Max length: 9
Median length: 5
Mean length: 5.806036
Min length: 3

Characters and Unicode

Total characters: 2903018
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Education
2nd row: Car
3rd row: Home
4th row: Car
5th row: Car

Common Values

Value | Count | Frequency (%)
Education | 100418 | 20.1%
Business | 100220 | 20.0%
Other | 100204 | 20.0%
Car | 100156 | 20.0%
Home | 99002 | 19.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
education | 100418 | 20.1%
business | 100220 | 20.0%
other | 100204 | 20.0%
car | 100156 | 20.0%
home | 99002 | 19.8%

Most occurring characters

Value | Count | Frequency (%)
s | 300660 | 10.4%
e | 299426 | 10.3%
n | 200638 | 6.9%
i | 200638 | 6.9%
u | 200638 | 6.9%
t | 200622 | 6.9%
a | 200574 | 6.9%
r | 200360 | 6.9%
o | 199420 | 6.9%
E | 100418 | 3.5%
Other values (8) | 799624 | 27.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2903018 | 100.0%

Most frequent character per category

(unknown): same character counts as under "Most occurring characters".

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2903018 | 100.0%

Most frequent character per script

(unknown): same character counts as under "Most occurring characters".

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2903018 | 100.0%

Most frequent character per block

(unknown): same character counts as under "Most occurring characters".

credit_score
Real number (ℝ)

Missing 

Distinct: 40920
Distinct (%): 8.6%
Missing: 25000
Missing (%): 5.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 649.87077
Minimum: 300
Maximum: 850
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 300
5-th percentile: 518.59
Q1: 596
median: 649.93
Q3: 703.84
95-th percentile: 781.87
Maximum: 850
Range: 550
Interquartile range (IQR): 107.84

Descriptive statistics

Standard deviation: 79.530061
Coefficient of variation (CV): 0.12237827
Kurtosis: -0.11694106
Mean: 649.87077
Median Absolute Deviation (MAD): 53.92
Skewness: -0.039704784
Sum: 3.0868862 × 10^8
Variance: 6325.0306
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
850 | 2895 | 0.6%
632.96 | 43 | < 0.1%
688.16 | 41 | < 0.1%
628.11 | 40 | < 0.1%
636.04 | 40 | < 0.1%
618.87 | 40 | < 0.1%
658.33 | 39 | < 0.1%
646.4 | 39 | < 0.1%
628.91 | 39 | < 0.1%
623.66 | 38 | < 0.1%
Other values (40910) | 471746 | 94.3%
(Missing) | 25000 | 5.0%

Minimum 10 values

Value | Count | Frequency (%)
300 | 1 | < 0.1%
302.14 | 1 | < 0.1%
302.95 | 1 | < 0.1%
303.68 | 1 | < 0.1%
305.01 | 1 | < 0.1%
310.32 | 1 | < 0.1%
310.37 | 1 | < 0.1%
317.03 | 1 | < 0.1%
318.51 | 1 | < 0.1%
320.84 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
850 | 2895 | 0.6%
849.99 | 1 | < 0.1%
849.98 | 2 | < 0.1%
849.97 | 4 | < 0.1%
849.95 | 1 | < 0.1%
849.94 | 1 | < 0.1%
849.93 | 2 | < 0.1%
849.92 | 1 | < 0.1%
849.91 | 2 | < 0.1%
849.87 | 3 | < 0.1%

repayment_history
Real number (ℝ)

Zeros 

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.999366
Minimum: 0
Maximum: 13
Zeros: 67883
Zeros (%): 13.6%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 3
95-th percentile: 5
Maximum: 13
Range: 13
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4142778
Coefficient of variation (CV): 0.70736312
Kurtosis: 0.51765062
Mean: 1.999366
Median Absolute Deviation (MAD): 1
Skewness: 0.7086332
Sum: 999683
Variance: 2.0001816
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Common values

Value | Count | Frequency (%)
2 | 135562 | 27.1%
1 | 134979 | 27.0%
3 | 90208 | 18.0%
0 | 67883 | 13.6%
4 | 45184 | 9.0%
5 | 17902 | 3.6%
6 | 5977 | 1.2%
7 | 1755 | 0.4%
8 | 420 | 0.1%
9 | 105 | < 0.1%
Other values (4) | 25 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 67883 | 13.6%
1 | 134979 | 27.0%
2 | 135562 | 27.1%
3 | 90208 | 18.0%
4 | 45184 | 9.0%
5 | 17902 | 3.6%
6 | 5977 | 1.2%
7 | 1755 | 0.4%
8 | 420 | 0.1%
9 | 105 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
13 | 1 | < 0.1%
12 | 1 | < 0.1%
11 | 3 | < 0.1%
10 | 20 | < 0.1%
9 | 105 | < 0.1%
8 | 420 | 0.1%
7 | 1755 | 0.4%
6 | 5977 | 1.2%
5 | 17902 | 3.6%
4 | 45184 | 9.0%

transaction_count
Real number (ℝ)

Distinct: 65
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.008804
Minimum: 20
Maximum: 85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 20
5-th percentile: 39
Q1: 45
median: 50
Q3: 55
95-th percentile: 62
Maximum: 85
Range: 65
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.063937
Coefficient of variation (CV): 0.14125387
Kurtosis: 0.029693761
Mean: 50.008804
Median Absolute Deviation (MAD): 5
Skewness: 0.1439653
Sum: 25004402
Variance: 49.899206
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
50 | 28225 | 5.6%
49 | 28162 | 5.6%
48 | 27773 | 5.6%
51 | 27518 | 5.5%
52 | 26655 | 5.3%
47 | 26514 | 5.3%
46 | 24942 | 5.0%
53 | 24931 | 5.0%
54 | 23119 | 4.6%
45 | 23074 | 4.6%
Other values (55) | 239087 | 47.8%

Minimum 10 values

Value | Count | Frequency (%)
20 | 2 | < 0.1%
22 | 2 | < 0.1%
23 | 5 | < 0.1%
24 | 13 | < 0.1%
25 | 22 | < 0.1%
26 | 35 | < 0.1%
27 | 62 | < 0.1%
28 | 117 | < 0.1%
29 | 223 | < 0.1%
30 | 346 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
85 | 2 | < 0.1%
84 | 1 | < 0.1%
83 | 2 | < 0.1%
82 | 2 | < 0.1%
81 | 11 | < 0.1%
80 | 7 | < 0.1%
79 | 24 | < 0.1%
78 | 30 | < 0.1%
77 | 50 | < 0.1%
76 | 79 | < 0.1%

spending_ratio
Real number (ℝ)

Distinct: 8384
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.086239
Minimum: 5
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.8 MiB

Quantile statistics

Minimum: 5
5-th percentile: 15.37
Q1: 29.93
median: 40.02
Q3: 50.16
95-th percentile: 64.73
Maximum: 100
Range: 95
Interquartile range (IQR): 20.23

Descriptive statistics

Standard deviation: 14.875281
Coefficient of variation (CV): 0.37108198
Kurtosis: -0.16384105
Mean: 40.086239
Median Absolute Deviation (MAD): 10.12
Skewness: 0.063279303
Sum: 20043120
Variance: 221.27399
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
5 | 4880 | 1.0%
39.63 | 176 | < 0.1%
36.95 | 167 | < 0.1%
46 | 165 | < 0.1%
42.57 | 164 | < 0.1%
40.15 | 162 | < 0.1%
38.72 | 162 | < 0.1%
34.26 | 161 | < 0.1%
38.46 | 161 | < 0.1%
39.43 | 160 | < 0.1%
Other values (8374) | 493642 | 98.7%

Minimum 10 values

Value | Count | Frequency (%)
5 | 4880 | 1.0%
5.01 | 10 | < 0.1%
5.02 | 11 | < 0.1%
5.03 | 13 | < 0.1%
5.04 | 7 | < 0.1%
5.05 | 10 | < 0.1%
5.06 | 4 | < 0.1%
5.07 | 13 | < 0.1%
5.08 | 8 | < 0.1%
5.09 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
100 | 13 | < 0.1%
99.56 | 1 | < 0.1%
99.53 | 1 | < 0.1%
99.39 | 1 | < 0.1%
99.17 | 1 | < 0.1%
98.62 | 1 | < 0.1%
98.3 | 1 | < 0.1%
97.87 | 1 | < 0.1%
97.61 | 1 | < 0.1%
97.48 | 1 | < 0.1%

join_date
DateTime

Distinct: 3650
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 MiB
Minimum: 2015-01-01 00:00:00
Maximum: 2024-12-28 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%
Histogram with fixed size bins (bins=50)

default_flag
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 MiB
0: 424435
1: 75565

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 500000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 424435 | 84.9%
1 | 75565 | 15.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 424435 | 84.9%
1 | 75565 | 15.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 424435 | 84.9%
1 | 75565 | 15.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 500000 | 100.0%

Most frequent character per category

(unknown): same character counts as under "Most occurring characters".

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 500000 | 100.0%

Most frequent character per script

(unknown): same character counts as under "Most occurring characters".

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 500000 | 100.0%

Most frequent character per block

(unknown): same character counts as under "Most occurring characters".

Interactions

[64 interaction plots (Matplotlib v3.10.0); images not preserved in this text]

Correlations

variable | age | annual_income | credit_score | customer_id | default_flag | education_level | employment_type | gender | loan_amount | loan_purpose | region | repayment_history | spending_ratio | transaction_count
age | 1.000 | 0.001 | 0.001 | 0.000 | 0.004 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | -0.002 | 0.001 | 0.001
annual_income | 0.001 | 1.000 | -0.001 | 0.001 | 0.000 | 0.001 | 0.000 | 0.003 | -0.002 | 0.001 | 0.000 | 0.000 | 0.002 | 0.001
credit_score | 0.001 | -0.001 | 1.000 | -0.002 | 0.000 | 0.000 | 0.003 | 0.000 | 0.001 | 0.000 | 0.000 | -0.001 | 0.001 | 0.001
customer_id | 0.000 | 0.001 | -0.002 | 1.000 | 0.004 | 0.000 | 0.000 | 0.000 | -0.001 | 0.002 | 0.000 | -0.001 | -0.000 | -0.001
default_flag | 0.004 | 0.000 | 0.000 | 0.004 | 1.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000
education_level | 0.002 | 0.001 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.002 | 0.001 | 0.003
employment_type | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 1.000 | 0.001 | 0.001 | 0.002 | 0.002 | 0.001 | 0.000 | 0.000
gender | 0.000 | 0.003 | 0.000 | 0.000 | 0.003 | 0.000 | 0.001 | 1.000 | 0.005 | 0.005 | 0.002 | 0.000 | 0.002 | 0.001
loan_amount | 0.000 | -0.002 | 0.001 | -0.001 | 0.000 | 0.000 | 0.001 | 0.005 | 1.000 | 0.001 | 0.001 | 0.001 | 0.002 | -0.001
loan_purpose | 0.000 | 0.001 | 0.000 | 0.002 | 0.000 | 0.001 | 0.002 | 0.005 | 0.001 | 1.000 | 0.000 | 0.002 | 0.002 | 0.000
region | 0.002 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.002 | 0.002 | 0.001 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
repayment_history | -0.002 | 0.000 | -0.001 | -0.001 | 0.000 | 0.002 | 0.001 | 0.000 | 0.001 | 0.002 | 0.000 | 1.000 | -0.000 | 0.001
spending_ratio | 0.001 | 0.002 | 0.001 | -0.000 | 0.000 | 0.001 | 0.000 | 0.002 | 0.002 | 0.002 | 0.000 | -0.000 | 1.000 | -0.002
transaction_count | 0.001 | 0.001 | 0.001 | -0.001 | 0.000 | 0.003 | 0.000 | 0.001 | -0.001 | 0.000 | 0.000 | 0.001 | -0.002 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
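
Nullity correlation can be illustrated with a small example. The sketch below assumes the measure is the Pearson correlation between 0/1 "is missing" indicators of two columns, which the report does not confirm. It uses only the 20 rows shown in the Sample section of this report, so the numbers are illustrative and say nothing about the full 500000-row dataset. The helper names pearson and indicator are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def indicator(missing_positions, n_rows=20):
    """Return a 0/1 list that is 1 where the value is missing."""
    return [1 if i in missing_positions else 0 for i in range(n_rows)]

# Positions 0-9 are the first ten sample rows, 10-19 the last ten.
# NaN positions read off the Sample section (illustrative subset only).
age_missing = indicator({5, 6})
income_missing = indicator({6, 12})
score_missing = indicator({5, 19})

nullity_corr = {
    ("age", "annual_income"): pearson(age_missing, income_missing),
    ("age", "credit_score"): pearson(age_missing, score_missing),
    ("annual_income", "credit_score"): pearson(income_missing, score_missing),
}

for pair, value in nullity_corr.items():
    print(pair, round(value, 3))
```

A value near 1 would mean two columns tend to be missing in the same rows, a value near -1 that one tends to be present when the other is missing, and a value near 0 that missingness in one says little about the other.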

Sample

First rows

(index) | customer_id | age | gender | region | education_level | employment_type | annual_income | loan_amount | loan_purpose | credit_score | repayment_history | transaction_count | spending_ratio | join_date | default_flag
0 | 100000 | 59.0 | Female | East | Secondary | Salaried | 240079.54 | 92740.78 | Education | 784.17 | 1 | 53 | 41.30 | 2016-10-06 | 0
1 | 100001 | 49.0 | Female | South | Graduate | Self-Employed | 438923.30 | 64315.33 | Car | 589.70 | 1 | 62 | 12.19 | 2022-04-01 | 0
2 | 100002 | 35.0 | Female | North | Secondary | Salaried | 424122.06 | 632481.94 | Home | 625.79 | 2 | 45 | 23.68 | 2024-12-24 | 0
3 | 100003 | 63.0 | Female | North | Secondary | Salaried | 322274.92 | 118465.97 | Car | 627.48 | 2 | 57 | 32.66 | 2021-03-17 | 0
4 | 100004 | 28.0 | Female | East | Graduate | Self-Employed | 1371925.76 | 131836.27 | Car | 803.18 | 1 | 46 | 15.40 | 2024-04-25 | 1
5 | 100005 | NaN | Male | North | Graduate | Salaried | 140205.29 | 636816.05 | Car | NaN | 2 | 63 | 47.89 | 2020-08-11 | 0
6 | 100006 | NaN | Male | South | Secondary | Salaried | NaN | 54893.56 | Education | 618.81 | 0 | 39 | 30.48 | 2022-01-12 | 0
7 | 100007 | 39.0 | Male | North | Primary | Salaried | 884965.70 | 112443.67 | Education | 789.83 | 2 | 48 | 48.40 | 2017-10-23 | 0
8 | 100008 | 43.0 | Male | North | Graduate | Salaried | 410172.80 | 207545.02 | Car | 557.85 | 2 | 61 | 35.56 | 2020-12-02 | 0
9 | 100009 | 31.0 | Male | East | Graduate | Self-Employed | 731759.92 | 213505.15 | Business | 596.20 | 2 | 56 | 29.38 | 2019-08-07 | 0

Last rows

(index) | customer_id | age | gender | region | education_level | employment_type | annual_income | loan_amount | loan_purpose | credit_score | repayment_history | transaction_count | spending_ratio | join_date | default_flag
499990 | 599990 | 41.0 | Male | East | Secondary | Self-Employed | 397384.54 | 233648.49 | Home | 771.71 | 0 | 57 | 39.47 | 2022-02-24 | 0
499991 | 599991 | 56.0 | Male | North | Graduate | Salaried | 373832.46 | 176198.19 | Car | 602.24 | 0 | 38 | 30.83 | 2018-03-21 | 0
499992 | 599992 | 35.0 | Female | East | Graduate | Salaried | NaN | 343362.36 | Car | 532.31 | 2 | 57 | 44.99 | 2019-09-25 | 0
499993 | 599993 | 67.0 | Female | West | Secondary | Salaried | 1536894.34 | 199143.32 | Car | 605.07 | 0 | 45 | 42.09 | 2020-07-27 | 0
499994 | 599994 | 67.0 | Male | South | Primary | Self-Employed | 374245.04 | 279921.48 | Education | 692.43 | 5 | 46 | 48.84 | 2016-11-27 | 0
499995 | 599995 | 31.0 | Female | West | Graduate | Salaried | 591909.32 | 89253.73 | Education | 627.74 | 0 | 37 | 59.93 | 2024-03-08 | 1
499996 | 599996 | 63.0 | Male | North | Secondary | Unemployed | 983386.51 | 119731.07 | Business | 771.31 | 2 | 59 | 11.37 | 2016-06-22 | 0
499997 | 599997 | 63.0 | Other | South | Secondary | Salaried | 280465.76 | 340991.05 | Education | 663.07 | 2 | 55 | 48.06 | 2021-04-16 | 0
499998 | 599998 | 31.0 | Male | North | Primary | Salaried | 304002.49 | 75333.63 | Car | 718.97 | 4 | 58 | 37.98 | 2024-05-08 | 0
499999 | 599999 | 39.0 | Female | East | Secondary | Salaried | 259383.79 | 480386.08 | Car | NaN | 1 | 39 | 22.53 | 2023-07-16 | 0